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Abstract 

We implement a cluster-update Monte Carlo algorithm to simulate magnetic dipoles of the 
XY-spin type confined in a two-dimensional plane. The long-range character and anisotropy in 
the dipole interaction are handled by using the Luijten-Blote algorithm and the Dotsenko-Selke- 
Talapov algorithm, respectively. We have checked the performance of this cluster-update algorithm 
in comparison to the Metropolis algorithm and found that it equilibrated the system faster in terms 
of the number of flipped spins, although the overall computational complexity of the problem 
remained the same. 
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I. INTRODUCTION 



Magnetism has been one of the most important subjects in physics, and many of techno- 
logical applications are based upon ordered behaviors of magnetic materials. This leads us 
to both a theoretical and practical question about how to understand collective behaviors 
observed in magnetic systems. A particularly interesting case to us is rare-earth compounds 
in which magnetic spins at low temperatures can be basically regarded as two-dimensional 
(2D) [1] and the exchange interaction is relatively weak [2j because one can easily make 
use of theoretical frameworks developed to study such continuous spin models in statistical 
physics. Let us consider a square lattice of XY-like spins governed by the dipole interac- 
tion. If the linear size of the lattice is L, the total number of spins will be N = L 2 , and the 
corresponding Hamiltonian is given as follows: 

TV 

H = JYs [(* ■ s 3 )rl - 3( Sl ■ rij )( Sj ■ rij )] /r|, (1) 

where J(> 0) represents interaction strength and the summation runs over every distinct 
spin pair of and Sj at r*j and Tj, respectively. The distance between this spin pair is 
denoted as = \ri — Vj\. Since the periodic boundary condition is employed to reduce 
unwanted boundary effects, the relative displacement between and rj is chosen as 
the one with minimal length among every possible pair of their periodic images. In case 
that more than one periodic image of a spin has the same minimal length from another 
spin, ambiguity can enter in defining the interaction within this pair due to the anisotropy 
manifested in the second term of Eq. (TjQ). To be simple, we neglect the interaction in such 
a case, and this choice does not hurt any essential properties of the system. It is notable 
that the interaction energy between Sj and Sj has an overall distance dependence as r^ 3 
and is determined by both their phase difference and the relative displacement, r^. In a 
numerical analysis, the long-range character implies computational complexity of 0(N 2 ) 
for the simple Metropolis algorithm (see, however, Ref. [3| for a possible modification of 
this approach). For this reason, it has not been easy to precisely determine the physical 
properties of the phase transition in a dipole system (see Refs. {4-6] and references therein), 
and it still remains to be investigated from an algorithmic point of view. 

In this work, we try to implement a Wolff-type single-cluster update algorithm [?| for 
a dipole system to challenge this issue. Specifically, we combine the Luijten-Blote (LB) 



algorithm jg| and the Dotsenko-Selke-Talapov (DST) algorithm jg] to analyze Eq. ([I]) and 
check the results in comparison to the Metropolis single-spin update algorithm. This work is 
organized as follows: We begin with the LB algorithm in Sec. Ill Al and the DST algorithms 
will be given in Sec. Ill Bt Then we combine them to construct the cluster- up date algorithm 
and present results in Sec. Ill CI This work is summarized in Sec. Ill II 

II. METHOD AND RESULTS 
A. Luijten-Blote Algorithm 

Let us begin with a 2D ferromagnetic system described as 

H = -J^JijSi- Sj, (2) 

where Jij = J/rfj and each index runs from 1 to N. We regard s n as an Ising spin for 
a while. If one directly applies the Wolff algorithm, the updating procedure would be as 
follows. 

1. Pick a spin randomly and add its index into a stack. 

2. Retrieve an element i from the stack. 

3. For every other spin Sj (j ^ i) in the system, add its index j into the stack with 
probability P{j = [1 — exp(— 2Jij/ksT)] 5 BitS . with the Boltzmann constant ks- 

4. If the stack is not empty, go to Step [2J Otherwise, flip the cluster. 

It is obvious that Step [3] spends time proportional to 0(N 2 ) for every retrieval if the program 
checks whether Sj = Sj for each spin pair. The idea of the LB algorithm is that we may first 
assume s« = Sj to calculate P[j = 1 — exp(—2Jij / ksT) , which is fixed throughout the whole 
computation. Note that the index % is irrelevant because every point is equivalent under 
the periodic boundary condition. Hence, instead of checking this probability for every spin, 
we can build a probability table in advance to correctly pick up possible s^'s in terms of a 
position relative to Sj. This saves time to O(NlogN), where logiV appears in looking up 



the table with the binary search [10|]. If the picked Sj does not point in the same direction as 
Si, we do not add it into the stack because we need to recover P i} j = P[j x <5 SliSj . Specifically, 
the prescription in Ref. p] can be written as follows: 



1. Assume that we have chosen Sj from which we grow a cluster. 

2. A spin s n (j 7^ i) can be chosen with probability Qo(n) = (1 — P/-J x (1 — P- 2 ) x • • • x 
(1 — Pl n _i) x P[ n . Let's say Sj is picked up by performing this step. 

3. The next spin is then selected according to a new probability distribution, Qj(n) = 
(1 - J P/ J+1 ) x (1 - P/_. +2 ) x ■ • • x (1 - P^_ x ) x P' in . This step is repeated: that is, 
whenever a spin s\ is selected, we work with Qi(n) to find the next one. This repetition 
stops when no more spins are picked up. 

By doing this, each spin Sj is chosen with its own correct probability, P-j. To demonstrate 
this, let us set b n = 1 when s n is picked up and b n = in the other case. If we denote 
Pr(b\, &2, ■ • • , b n ) as the probability of each possible event, the sum of all the probabilities 
to choose s 2 then reads as 

Pr(0, 1) + Pr(l, 1) = (1 - P; tl )Pl 2 + Pl.Pl, = Pl >2 . 

Likewise, the total probability to select s 3 is 

Pr(0, 0, 1) + Pr(0, 1, 1) + Pr(l, 0, 1) + Pr(l, 1, 1) 

= (i - (i - i>-)i>i :i + (i - p^p^pi, + PUH- p^pU + KiP^PU 

- P' 

and one can readily generalize this for any arbitrary Sj. It is also clear that this holds true 
no matter how one indexes spins as long as the index is used consistently throughout the 
calculation even though Ref. [8( makes it with respect to physical distances. 

In picking up a spin index from such a procedure, it is convenient to work with a cumu- 
lative distribution, that is, 



C 3 {n)= Qi(0 = l-exp 



-2 J *l k B T 

1=3+1 

1 n 

--ln[l-C J -(n)]= Jiilk B T = S 3 {n). (3) 

1=3+1 

and one will find which spin should be picked by comparing Cj(n) with a random number 
uniformly drawn over [0, 1). A beauty of the LB algorithm is that updating of Cj(n) with 
picking a spin hardly needs any additional computation because j only enters as a starting 
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index of the summation in Eq. (j3J) so that So(n) — So(j) = Sj(n) for 1 < j < n. In other 
words, it suffices to compute So(n) once and to memorize it for every n in advance of the 
Monte Carlo iterations. One needs to reset the starting index as j when having picked up 
spin j, and then one will get the correct Ju/ksT (accordingly, correct P/J for spin I > j 
each time by using Eq. ([3]). Note that it is most natural to define 5q(0) as zero. Then, the 
LB algorithm for solving Eq. fl2]) modifies the Wolff algorithm given above in the following 
way: 

1. Choose Si placed at a certain position, and make an array of partial sums So(n) with 
relative displacements from this chosen spin site. 

2. Pick a spin randomly and add its index into a stack. Set z as zero. 

3. Retrieve an element i from the stack and do the following: 

(a) Draw a uniform random number u G [0, 1) and find an index w that satisfies 
So(w) < —\ m(l — u) + So(z) < So(w + 1). If there is no such w, terminate this 
loop for i. Otherwise, set z as w for the next iteration. 

(b) Since w indicates only a relative position with respect to i, translate it into the 
actual position w'. 

(c) Add w' into the stack if Sj = v; go to step [3al 

4. If the stack is not empty, go to step [31 Otherwise, flip the cluster. 

For example, let us consider a 4 x 4 square lattice with periodic boundaries, where the spin 
sites (x, y) can be thus indexed by k(x, y) = x + Ay, ranging from to 15 [Fig. HJa)]. In step 
1, we may locate S{ at (x, y) = (0,0) on the lattice, so Sj = Sfc(o,o) — s o- With respect to 
this spin, one can compute Jy for every other spin Sj. We exclude self-interaction by setting 
■Ax) — 0, and should also be zero when x = 2 or y = 2 because their relative displacement 
is not unique due to the periodic boundary condition. In this way, one can readily construct 
the array So(n). Once computed in step 1, it can be used at any spin site (x,y) to calculate 
the probability to add another spin to the stack because the interaction depends only on 
the relative displacement between them, the relative position of Sj with respect to Si needs 
to be converted to the actual position on the lattice in step 3(b). Suppose that a spin at 
(1, 1), i.e., = s 5 , is retrieved from the stack. Then, the situation with this spin as a 
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FIG. 1: (a) A square lattice of size 4x4 indexed with reference to the spin at the top left position 
(x,y) = (0,0). Note that the spin with index has two different shortest paths to the spin with 
index 2 due to the periodic boundary condition, (b) The indices are translated with a new reference 
point at (x,y) = (1,1). 

reference point is equivalent to Fig. QJb) under a simple translation. We check which other 
spins can be added to the stack by comparing the random number u with Cj(n) in step 3(a) 
[see Eq. (J3J)]. If this procedure tells us to add w = 13, this correspond to spin 2 according 
to the original index in Fig. Ufa). Therefore, we should add spin 2 to the stack. Now, the 
starting index in Eq. (jSJ) is changed to w — 13. Getting back to step 3(a) and drawing a new 
random number, let's say that we get w — 15 this time. As before, comparing Figs. H^a) 
andH^b), we add spin to the stack, and go back to step 3(a). The next w > 15 must be 
beyond the valid range of spin indices, so we stop considering and retrieve another spin 
from the stack. This is repeated until the stack becomes empty. 

This algorithm can be extended to simulate XY spins as well: one should assign a 
reflection plane by randomly drawing <p e [0, 2ir) on choosing a seed of the cluster, as 
originally devised in Ref. Q. Every spin inside the generated cluster will be reflected with 
respect to this plane. We denote this reflection as an operator so that the operation 
is represented as Sj —> R^Si. Accordingly, whether a spin can be included in the cluster 
should be also determined by the energy difference due to such a reflection so that the 
added probability becomes P it j = 1 — exp [—^(R^Si — Sj) • Sj/k B T}. The LB algorithm 
for the Ising case first overestimates P{j = 1 — exp(—2Jij/ksT) and then adjust it by 
using the Kronecker delta 5 Si}Sj . For XY spins, an overestimate occurs in the same way, 
but the adjustment should be made by replacing the Kronecker delta with the probability 
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-Padd = max <! 



l-cxp^-J ij (R 4 ,s i -s i )-Sj/k B T^ 
l-exp(-2 J^/fcsT) 



B. Dotsenko-Selke-Talapov Algorithm 



The DST algorithm was devised to use cluster updates in frustrated systems. In order to 
illustrate the main idea, we consider a local version of Eq. (JT|): 



H = J si ■ Sj - 3(si ■ rij)(sj ■ r 

(ij) 



(4) 



where the summation runs over all the nearest neighbor pairs In growing a cluster C, 
one considers only the first term in Eq. (jlj) because it satisfies (R^Si) ■ (R^Sj) = Sj • Sj for 
any and does not cause any energy difference in the bulk of the cluster. Let us compare 
two spin configurations fi and v that are related by one cluster flip, i.e., Sj — > s- for every i 
so that s- = R<f,Si for i G C and s- = Sj for i ^ C. Then, the ratio of probabilities to select 
the configurations is 



exp 



J 

for 



E- 

(ij) 



exp 



J 



(i<=C,j£C) 

just as in the Wolff algorithm. For this generated cluster, we compute an additional energy 
contribution from the last anisotropic term, 



AE a = J23( Sl • rij )( Sj • r t] ) - 3(aJ • ■ r tj ) 



(i,3) 



22 3 ( S * ■ r ij)( S 3 ' r ij) - 3(-R^Si ■ r ij )(R ( j > S j ■ Ti 



+ Y 3 ( S * ' r tf)( a i ' r ij) - 3 (^ s i ■ r ij){ s j ■ fi 



(5) 



and accept this cluster move with probability P acc = min[l, exp(— AE a /ksT)}. Then the 
acceptance ratios will satisfy 



ex P (-AE a /k B T), 
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and the transition probabilities in total restore the detailed balance as 

P(fi -> v) g(p, -> v) A(fi -»■ v) 
P{y -> fi) g{v ->■ fi) A{v ->■ /i) 

J 



exp 



3 (id) 



x ex P ^ J] [ s * ' s 'j ~ 3 ( s i ' r tfX a S ' r 




Although this algorithm certainly works, one should note that the cluster growth 
does not exactly describe the given system, which means that the cluster update may 
not be helpful in overcoming critical slowing down [12I, [l3|- What usually happens is 
that a cluster grown to a large size is simply rejected at the last step, leading to an 
amount of inefficiency. Collecting AE a during the cluster growth may reduce this prob- 



lem to some extent [14J: we will check whether this cluster will be accepted every time 
G spins are added. Defining AE^ = J [3(s, ■ T*ij)(sj ■ ry) — 3(i?^Sj • J*y)(sj ■ r^)] and 

s i ' r *i) — 3(i2^Sj • r i j)(R < f,Sj ■ r^)], we write down the cluster al- 
gorithm for Eq. (j3J) as follows: 

1. Pick randomly a spin and add its index into a stack. Determine a reflection plane by 
randomly drawing <p e [0, 27r) and set a variable A-E 1 ^ as zero. 

2. Retrieve an element i from the stack. 

3. For every nearest neighbor j of i, 

(a) if j is not included in the cluster, add it into the stack with probability Pij = 
1 - exp [-J(R 4> s i - ■ Sj/k B T]. Add AE\f to AE a g . 

(b) Otherwise, add AE$ to AE a g . 

4. If G spins are added into the cluster or the stack is empty, check whether the cluster 
can be flipped with probability exp(— AE^/hsT) and then set AE g as zero. 

(a) If the answer is no, finish this Monte Carlo step. 

(b) If the stack is empty and the answer is yes, flip the cluster. 

(c) Otherwise, go back to step [2j 



7 



C. Cluster Algorithm 

We now combine the LB algorithm and the DST algorithm to solve the long-ranged 
anisotropic dipole interaction in Eq. ([T]). We generate a cluster by using the first term 
of Eq. (pp, which is the same as the LB algorithm (Sec. Ill Aft except that the interaction 
becomes antiferromagnetic due to J > 0. The DST algorithm (Sec. IIIBI) is needed to take 
the remaining terms into account. Because collecting terms during the cluster growth takes 
time of 0(N 2 ) on every retrieval from the stack, which is highly time-consuming, we examine 
the flip after fully generating a cluster at an expense of low acceptance ratio. The algorithm 
can be written down as follows: 

1. Imagine that Si is placed at the center of the square lattice, and make an array of 
partial sums So(n) with relative displacements from this spin site. 

2. Pick randomly a spin and add its index into a stack. Determine a reflection plane by 
randomly drawing G [0, 2n). 

3. Retrieve an element i from the stack and do the following: 

(a) Draw a uniform random number u G [0, 1) and find an index w that satisfies 
So(w) < — \ m(l — u) + Sq(z) < So(w + 1). If there is no such w, terminate this 
loop for %. Otherwise, set z as w for the next iteration. 

(b) Because w indicates only a relative position with respect to i, translate it into 
the actual position w'. 

(c) Add w' into the stack with probability 



Go to step [3U 

4. If the stack is not empty, go to step [3j Otherwise, go to the next step. 

5. For every spin pair i,j inside the generated cluster C, calculate the energy difference 
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FIG. 2: The total number of flipped spins is denoted as nf and / is the number of cluster flips. 
We plot (a) the average number of updated spins per flip and (b) the magnetic order parameter as 
a function of n/, measured for L = 16 and temperature T = 0.7 in units of J/k. (c) Acceptance 
ratios of the cluster algorithm and the Metropolis algorithm. The system size is taken as L = 8. 
(d) Magnetic order parameter obtained by using the cluster algorithm. The dotted lines show 
results based on the Metropolis algorithm for comparison. 

6. For every spin pair j 6 C and j ^ C, calculate the energy difference 

Surface = [ 3 ( Sj ' r «)( a J ' ^ ~ 3 ( S * ' r <i)(*i " ^'j)] l r l- 

ij 

7. Flip the cluster with probability 

P acc = min {1, exp {-(AEU + AE^ {ace )/k B T]} . 



As one sees in Sec. Ill the energy contribution due to the anisotropy should be calculated 
inside the cluster and at its surface. If the generated cluster has a size c, the computation 
for the bulk part roughly takes c 2 /2 while the surface part needs c(N — c). In order for 
the whole system to be updated, this should be repeated N/c times. Hence, as a whole, 
it takes [c 2 + c(N — c)] x N/c = N 2 [l — ^] . In other words, 0(N 2 ) complexity does not 
disappear, but decreases to a limited extent. Figure EJja) shows how many spins one cluster 
flip actually updates, which is a small number. Here, the temperature T = 0.7 Jj k B is 
chosen to be around the order-disorder transition point [6|. If we measure time in terms 



of the number of flipped spins as in Ref. 15|, the magnitude of staggered magnetization, 



m, is observed to equilibrate substantially faster than the standard Metropolis algorithm 
[Fig. |2fb)]. Here, the staggered magnetization is defined as 



m = {m x ,m y ) 



with <Tj = [(— l) Vi cos6*i, (— l) Xi sin^j], where the position of each spin is given as = (xi,yi) 
and the spin variable is written as s, = (cos#j, sin^j) [16| . We take the magnitude m = \m\ 
as the order parameter of this dipole system. Figure MJo) implies that the global update 
indeed carries out nontrivial moves, even though this factor is largely compensated by the 
low acceptance ratio in practical computations [Fig [2(c)]. A trick to get a higher acceptance 
ratio is to perform occasionally such a move that rotates every spin in the generated cluster 
by 7r because this is the only possible global move that does not cause AEg u]k . However, 
this trivial move hardly makes any essential difference in performance. Figure E(d) shows 
the outcomes from this algorithm, as well as results based on the Metropolis algorithm for 
comparison. The nice agreement found in the order parameter confirms the validity of this 
cluster algorithm. 

The autocorrelation time r can be measured by integrating the autocorrelation for an 
equilibrated time series of m. In Fig. [31 we compare r of our cluster algorithm with that of 
the Metropolis algorithm. We have very limited sizes so it is not easy to quantify the critical 
behavior r ~ L z . For the Metropolis algorithm, however, r « L seems to be a plausible 
description up to the sizes used in this work [Fig. Ufa)] . On the other hand, the cluster 
algorithm shows only a little increase in r at L = 16 [Fig. [3(b)], which suggests that r can 
be a sublinear function of L. 

III. DISCUSSION 

A recent numerical observation based on extensive use of the Metropolis algorithm sug- 
gests that the order-disorder transition of the 2D square dipole lattice is consistent with 
the 2D Ising universality class |4(, which has been inconclusive to a large extent. We have 
reached the same conclusion by running the Metropolis algorithm on a number of CPU's in 
parallel [6|. When run on a single CPU, the agreement of the cluster-algorithm approach 
found in Fig. E^d) is striking, and this shows that the main obstacle in identifying the critical 
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FIG. 3: (a) Autocorrelation time of m at T = 0.7 J /ks in the Metropolis algorithm, where one 
Monte Carlo step is defined as attempting to flip every spin in the system, (b) The same quantity 
in the cluster algorithm, where one Monte Carlo step corresponds to a cluster generation. 

behavior has been the equilibration rate, as suggested in Ref. [5]. An efficient algorithm is, 
therefore, called for in order to obtain more precise critical properties of the dipole lattices, 
and we hope that this cluster algorithm can be a step toward it. 

Even though the complexity of 0(N 2 ) still remains in our cluster algorithm, it has an ad- 
vantage over the simple Metropolis algorithm when time is measured by spin flips [Fig[2^b)]. 
The problem is that it shows little gain in terms of real time due to the low acceptance 
ratio, which comes from collecting the anisotropic contributions. This possibly indicates a 
direction to improve this cluster-update approach. 
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